Abstract- In this paper, we present a methodology that uses 
control signals provided through guided teleoperation to assist 
in the learning of new manipulation tasks. The approach 
incorporates haptic feedback that guides human behavior in 
performing a manipulation task using guidance forces derived 
from visual input data. The control signals provided by the user 
are then utilized by the robotic system to learn the control 
sequences necessary for task execution. A neural network 
learning method that incorporates historical information is 
utilized for the learning process. The primary focus of our 
approach is to develop a method to enable the robotic system to 
improve its ability to learn manipulation tasks, whether the 
instruction is provided by an expert or a general user. The 
methodology is explained in detail, and results of a 
manipulation system learning an object-centering task are 
presented. 

I. Introduction 

Recent trends in robotic applications have made possible the 
inclusion of mobile robotics in everyday life. Technological 
advances in locomotion have progressed such that mobility 
in a human-centric world is both feasible and realistic. As 
manipulation in the unstructured human environment can 
increase the robot's ability to perform needed tasks, 
manipulation seems to be the natural extension for enabling 
the further integration of robotics in our society. The 
difficulty though is that manipulation of everyday objects 
not only depends on object features such as size, surface 
texture, and weight, but also on the characteristics of the 
desired manipulation task. Humans have learned to deal with 
these difficulties by improving their abilities through the 
accrual of experiences. As such, humans naturally become 
an excellent reference for robots to learn from concerning 
objects and how to interact with them. This learning process 
can be achieved through teleoperative-based instruction, in 
which humans demonstrate manipulation activities by 
directly interfacing with the robot controller, and the robot, 
in turn, learns to generalize the actions needed to accomplish 
the new task. For example, a caregiver can provide basic 
instruction to a robot in performing manipulation tasks 
necessary in preparing food for an invalid in the home. 
Through generalization, the robot can then learn to perform 
the related operations despite the varying characteristics of 
the task. To successfully realize this teleoperative-based 
instruction capability, two primary challenges must first be 
addressed: 1) developing a low-cost, robust interaction 
method in which non-expert users can quickly and easily 
gain access to the capabilities of the system and 2) extracting 
relevant input-output control signals from the human user 
which enable the manipulation system to learn the control 
sequences necessary for task execution. By combining a 
vision-based haptic feedback approach with a learning 
methodology based on temporal data sequences, we believe 
that we can address these two issues and enable the robotic 
system to improve its ability to learn manipulation tasks, 
whether the instruction is provided by an expert or a 
general user. 

Currently, few research efforts have focused on using 
haptic feedback extracted from environmental interaction to 
improve manipulation capability through learning. In [1], 
force feedback was introduced during the teleoperation 
sequence using potential-field methods, but slow 
computation times limited their effectiveness. In 
robot-assisted adaptive training [2], custom force fields were 
studied for human interaction with a robotic handle and in 
[3], vision-guided trajectory generation using a neural 
network was applied for grasping in a virtual environment. 
In medical applications, there have been a number of 
research efforts that incorporate haptics and manipulation. 
The work in [4] utilizes a master controller and force sensors to 
detect a surgeon's hand movement during teleoperation, 
whereas [5] describes research focused on vision-based 
haptic exploration. In fact, research such as that found in [6-7] 
has shown the importance of haptic feedback in surgery 
tasks requiring manipulation. These studies document the 
performance increases that can be achieved when providing 
haptic feedback during teleoperation. In all these efforts 
though, utilizing haptic feedback to improve the learning of 
new manipulation tasks by the robotic system was not 
realized. 

To improve current capability for learning of new robot 
manipulation tasks through teleoperation, we present a 
methodology that utilizes visual inputs directly in producing 
force guidance data to assist human operation of a 
teleoperated manipulation system. By combining visual data 
and force feedback, the human operator's decisions are aided 
by haptic feedback in providing control signals to a 
manipulator device, and these signals are then used for 
learning the control sequences necessary for task execution. 
The following sections describe this approach for using 
haptically guided teleoperation to assist in the learning of 
new manipulation tasks. 



II. Algorithm: Learning New Manipulation Tasks 

A. Divided Force Guidance for Haptic Feedback 

A classical method found in path-planning applications is 
the potential field method in which forces are calculated that 
will repel a robot away from obstacles and attract a robot 
toward a designated goal location. Once these potential 
fields are calculated, the robot can theoretically navigate to a 
final goal position within an obstacle-strewn environment by 
transitioning along the force vectors. The goal of the haptic 
feedback system is to assist the user in implementing a new 
task through teleoperation such that the control signals 
provided by the user are not strongly dependent on operator 
expertise (since, as shown in [8], experience is closely linked 
with task performance when a user performs a teleoperated 
task). The potential field method can be adapted to provide 
haptic force feedback by producing potential-like forces that 
are attractive as the user moves toward a given goal and 
repelling when the user moves away from the goal. The 
adaptation of the potential field method is required, first, to 
adjust for differences in object distance and object size. Most 
importantly, though, it must account for the small forces that 
the potential field method typically produces (such as those 
generated when a robot is close to the goal). These forces 
must be emphasized, since too small a force generated in the 
haptic device will go unnoticed by the human user. 
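
The need to emphasize small forces near the goal can be illustrated with a minimal sketch. It assumes a linear attractive term with a hypothetical gain and a hypothetical minimum perceptible magnitude `f_min`; the function name and all numeric values are illustrative assumptions and are not taken from the paper, whose own force law is the divided force guidance of [9].

```python
import math


def attractive_force(pos, goal, gain=1.0, f_min=0.2, tol=1e-3):
    """Return an (fx, fy) force pulling `pos` toward `goal`.

    Assumption (not from the paper): linear attraction with a floor
    `f_min` on the magnitude, so that the force generated close to the
    goal is still large enough to be felt through the haptic device.
    """
    dx, dy = goal[0] - pos[0], goal[1] - pos[1]
    dist = math.hypot(dx, dy)
    if dist < tol:
        # Centered on the goal: no guidance force is needed.
        return (0.0, 0.0)
    # Raise forces that would otherwise be too weak to notice.
    magnitude = max(gain * dist, f_min)
    return (magnitude * dx / dist, magnitude * dy / dist)
```

Without the floor, a purely linear field would shrink toward zero as the user nears the goal, which is the situation the text warns about.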

With regard to haptic feedback, we would like to create a 
potential field that modulates the haptic feedback forces in 
order to guide the user toward the target location. 
This is achieved by using a methodology we call divided 
force guidance [9], in which the calculated forces are 
correlated with the size of the object and the distance from 
it, as determined by vision-based methods (Figure 1). Our 
basic assumption is 
that the feedback system consists of an eye-in-hand setup in 
which the camera is mounted to the end-effector of the 
manipulator arm (Figure 2). Using this configuration, visual 
data is first used to calculate the dimensions of the target 
object. As the user commands the manipulator device, the 
distance from the target (calculated based on the distance 
offset from the end-effector pointing direction) is determined 
and used to output a haptic feedback force. 

The object in the robot's coordinate frame, 
$P_o = (x_o, y_o, z_o)^T \in \mathbb{R}^3$, is observed by a camera and translated 
by $T_o^c$ into the camera coordinate frame, $P_c = (x_c, y_c) \in \mathbb{R}^2$. 
The positional difference between $P_c$ and the center of the 
image plane, $P_{center} = (x_{center}, y_{center}) \in \mathbb{R}^2$, is directly used to 
generate a guidance force $F_h = (F_x, F_y) \in \mathbb{R}^2$ in the joystick 
after being translated by a function $T_c^h$. After acquiring the 
target object position, x- and y-directional forces are 
generated based on the following equations: 
where $\alpha$ is the approach ratio ($0 < \alpha < 1$) that is set based on 
manipulator speed and teleoperation time delays (e.g. the 
faster the manipulator movements, the smaller the approach 
ratio), $k_p$ and $k_n$ are positive and negative coefficients 
which change values between {low, high} for situations in 
which the object is identified as far, near, or too close, $x_{joy}$ 
and $y_{joy}$ represent the x- and y-coordinates of the current 
joystick position, and $(x_{offset}, y_{offset})$ is a changing offset 
position that iteratively determines waypoints that approach 
the target object around which the haptic forces are 
generated. The guidance forces computed allow the haptic 
feedback system to influence the user to move the 
manipulator arm toward the object. Once a position is 
attained that is centered directly on the target location, this 
force is effectively set to zero. 
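
The roles of the approach ratio, the offset waypoint, and the two coefficients can be sketched as follows. This is one plausible reading and not the paper's force equations: it assumes that the offset moves a fraction `alpha` of the remaining distance toward the target at each iteration, and that each axis feels a spring-like force toward the offset whose gain depends on the side of the offset on which the joystick lies. The function names and the per-side gain rule are hypothetical.

```python
def next_offset(offset, target, alpha):
    """Advance the waypoint by a fraction `alpha` of the remaining distance.

    Assumption: this update rule is illustrative; the paper only states
    that the offset iteratively determines waypoints approaching the target.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("approach ratio must satisfy 0 < alpha < 1")
    return (offset[0] + alpha * (target[0] - offset[0]),
            offset[1] + alpha * (target[1] - offset[1]))


def axis_force(joy, offset, k_pos, k_neg):
    """Spring-like force on one axis pulling the joystick toward the offset."""
    error = joy - offset
    # Hypothetical rule: one gain per side of the waypoint.
    gain = k_pos if error > 0 else k_neg
    return -gain * error


def guidance_force(joy_xy, offset_xy, k_pos, k_neg):
    """Return (F_x, F_y); both components vanish when centered on the offset."""
    return (axis_force(joy_xy[0], offset_xy[0], k_pos, k_neg),
            axis_force(joy_xy[1], offset_xy[1], k_pos, k_neg))
```

With `alpha = 0.5`, an offset starting at (0, 0) and a target at (8, 4) yields the waypoints (4, 2), (6, 3), (7, 3.5), and so on. A smaller `alpha` takes shorter steps, which matches the remark that faster manipulators call for a smaller approach ratio.
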
Figure 2. Eye-in-hand configuration for extracting visual data used in 
divided force guidance approach. 



B. Learning through Haptically Guided Manipulation 

Given implementation of a new task by the user, the goal 
of the system is to learn the relevant control signals that 
enable execution of the new task without requiring 
interaction from the human. Although the task we present 
for learning in Section III is focused on object centering, the 
methodology proposed is designed to allow learning of 
manipulation tasks in which both the position and 
orientation of the end-effector are significant (such as pouring 
a cup of coffee or grooming with a hair brush). Thus, the 
learning algorithm is defined in joint space rather than 
Cartesian space. We note from the kinematics that define a 
manipulator system that the joint angles needed to command 
positioning of the end-effector are inherently coupled. To 
maintain end-effector position, changing the joint angle of 
any one joint requires an associated change in the kinematic 
chain of joint angles that lead to the last joint connected to 
the end-effector. As such, any learning methodology that is 
utilized should account for this dependency in its 
implementation. We accomplish this by utilizing a 
hierarchically structured framework of feedforward neural 
networks whose structure correlates to the manipulator's 
kinematic chain of joint angles. In addition, since we desire 
to learn the sequence of control signals that enable execution 
of a task, we must incorporate the temporal dependencies 
inherent to the task. This feature is integrated into the 
learning algorithm by adding in previous joint angle values 
to the input pattern fed into the neural network. The final 
learning framework is thus represented as follows (Figure 
3): 
Figure 3. Hierarchical Neural Network Structure 
where $P_c = (x_c, y_c)$ represents the position of the object in 
the camera frame of reference, $(1, \ldots, N)$ is the respective 
position of the joint within the kinematic chain, $\theta$ is the joint 
angle commanded during one step in the teleoperated task 
sequence, and $t \in \{0, T\}$ is the respective step in the 
teleoperated task sequence, which terminates at step $T$. In 
this hierarchy, there is one level associated with each joint 
(resulting in a 5-level structure for a 5-DOF manipulator 
arm). The input into each level consists of the position of the 
object, the learned joint angles extracted from the 
lower-level neural networks, and the current joint angle 
associated with that level (e.g. the angles output from the 
neural networks associated with joints 1 through $N-1$ and the 
angle of joint $N$ would feed into the neural network at level 
$N$). The output of each level consists of the next angle to 
command the respective joint. In this algorithmic 
implementation, the neural network at each level has one 
hidden layer and is trained in sequence using the standard 
backpropagation algorithm. 
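
The hierarchy described above can be sketched in NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the hidden-layer size, tanh activation, learning rate, epoch count, and all class and function names are hypothetical. Each level receives the object position, the next angles predicted by the lower levels, and the current angle of its own joint, and the levels are trained one after another with plain gradient-descent backpropagation.

```python
import numpy as np


class JointNet:
    """One-hidden-layer network predicting the next angle of one joint."""

    def __init__(self, n_in, n_hidden, rng):
        self.W1 = rng.normal(0.0, 0.5, (n_in, n_hidden))
        self.b1 = np.zeros(n_hidden)
        self.W2 = rng.normal(0.0, 0.5, (n_hidden, 1))
        self.b2 = np.zeros(1)

    def forward(self, X):
        self.H = np.tanh(X @ self.W1 + self.b1)  # hidden activations
        return self.H @ self.W2 + self.b2

    def train(self, X, y, epochs=2000, lr=0.05):
        n = len(X)
        for _ in range(epochs):
            err = self.forward(X) - y  # gradient of 0.5 * squared error
            # Backpropagate through the output layer and the tanh units.
            dH = (err @ self.W2.T) * (1.0 - self.H ** 2)
            self.W2 -= lr * self.H.T @ err / n
            self.b2 -= lr * err.mean(axis=0)
            self.W1 -= lr * X.T @ dH / n
            self.b1 -= lr * dH.mean(axis=0)


def train_hierarchy(obj_pos, angles, n_hidden=8, seed=0):
    """Train one network per joint, in kinematic-chain order.

    obj_pos: (T+1, 2) object positions in the camera frame.
    angles:  (T+1, N) joint angles recorded during teleoperation.
    """
    rng = np.random.default_rng(seed)
    nets = []
    lower = np.empty((len(angles) - 1, 0))  # outputs of lower levels
    for j in range(angles.shape[1]):
        X = np.hstack([obj_pos[:-1], lower, angles[:-1, j:j + 1]])
        net = JointNet(X.shape[1], n_hidden, rng)
        net.train(X, angles[1:, j:j + 1])  # target: angle at step t+1
        nets.append(net)
        lower = np.hstack([lower, net.forward(X)])
    return nets


def predict_next(nets, obj_xy, current_angles):
    """Return the next commanded angle for every joint."""
    nxt = []
    for j, net in enumerate(nets):
        x = np.array([[*obj_xy, *nxt, current_angles[j]]])
        nxt.append(float(net.forward(x)[0, 0]))
    return nxt
```

Feeding the current joint angle into each level is how this sketch carries the temporal dependency; rolling `predict_next` forward step by step would replay a learned sequence.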

By coupling haptically guided teleoperation with learning 
from temporal data sequences, we provide a methodology to 
learn new manipulation tasks from human interaction. Since 
visual servoing is a well-understood manipulation task, the 
following section discusses results of applying our method 
to this related object-centering task. Our objective is to show 
the feasibility of using haptically guided teleoperation such 
that an untrained user can instruct a robot to learn. 



III. EXPERIMENTAL RESULTS 

A. Test Setup 

Our manipulation system (Figure 4) consists of a 5-DOF 
manipulator arm (attached to a stationary Pioneer 3-AT 
mobility platform), a USB camera, and a laptop for hosting 
the robot controller. The Pioneer Arm is a relatively 
low-cost robot arm that is driven by six open-loop servo 
motors, providing 5 degrees of freedom with an end-effector 
capable of grasping objects up to 150 grams in weight. For 
acquisition of the visual data, we mount a small USB 
webcam on the gripper so our system can transmit the 
workspace view observed by the end-effector to the user. 
The maximum frame rate of the camera is approximately 30 
frames per second with a pixel resolution of 320x240. It also 
has a diagonal field of view of 54 degrees with a focus range 
of 5 cm to infinity. 

The haptic device (Figure 5) used by the human operator 
is a force-feedback joystick (Microsoft SideWinder Force 
Feedback 2), which has a 16-bit, 25 MHz on-board 
processor capable of delivering 100 different forces and 16 
programmable function buttons. The haptic device is also 
coupled with a user interface that receives the streaming 
images retrieved from the camera attached directly to the 
robot end-effector. 




Figure 5. Master Interface (Haptic + Vision) for Teleoperation 



B. Results 
In many manipulation tasks, the first step in the control 
sequence is to position the arm such that the target object is 
located within reach of the end-effector. We denote this 
pre-grasping process as object-centering when movement in 
one of the planes is constrained (Figure 6). As the primary 
focus in this paper is to provide an evaluation of the learning 
methodology, we simplify the vision-processing 
requirements by placing a brightly colored object against a 
uniform background. This allows the use of thresholding 
techniques to extract size and distance parameters for 
calculation of the values needed by the divided force 
guidance and learning methods. We first validate our divided 
force guidance technique using our master interface by 
conducting 20 trials for five selected places for object 
centering. The visual data from the camera system mounted 
on the robot is analyzed every 33 ms, and a target object is 
acquired and tracked. The guidance force is then generated 
directly from the visual data to guide the human operator 
using the haptic control device to move the arm toward the 
top of the object. In these trials, the approach ratio is chosen 
as 0.5 and the dimensions of the object are (25 mm, 30 mm) as 
measured in the x-y plane. 
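
The thresholding step can be sketched as follows, assuming an RGB frame held in a NumPy array. The function name, the color bounds, and the use of a bounding box for the object size are illustrative assumptions; the paper does not specify its thresholding procedure, and distance estimation is not covered here.

```python
import numpy as np


def locate_object(rgb, lower, upper):
    """Locate a brightly colored object by per-channel thresholding.

    rgb:   (H, W, 3) uint8 image.
    lower: per-channel lower bounds (hypothetical values chosen by the user).
    upper: per-channel upper bounds.
    Returns ((dx, dy), (width, height)) in pixels, where (dx, dy) is the
    offset of the object centroid from the image center, or None if no
    pixel falls inside the color range.
    """
    lower = np.asarray(lower)
    upper = np.asarray(upper)
    # A pixel belongs to the object if all three channels are in range.
    mask = np.all((rgb >= lower) & (rgb <= upper), axis=-1)
    ys, xs = np.nonzero(mask)
    if xs.size == 0:
        return None
    h, w = mask.shape
    offset = (float(xs.mean() - (w - 1) / 2.0),
              float(ys.mean() - (h - 1) / 2.0))
    size = (int(xs.max() - xs.min() + 1), int(ys.max() - ys.min() + 1))
    return offset, size
```

The centroid offset plays the role of the positional difference between the object and the image center that drives the guidance force, and it reaches zero when the object is centered.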






(a) Before (b) After 

Figure 6. Camera view of before and after Object Centering task. 
The arm is manipulated by a human operator who 'sees' through 
this view and 'feels' the guidance force directly generated from 
this visual data. The '+' mark indicates the size of the object, and 

Figure 7 shows typical graphs of distance change in object 
centering when the force guidance is enabled and disabled. 
With the guidance force, the arm moves toward the object 
along a clean path, whereas without the force guidance the 
arm hesitates at the beginning (since the operator has to 
decide where to move the arm) and takes 
more time in centering (see Figure 8). The average time for 
object centering (for a certain target position) was 2.1 sec for 
the first 10 trials and 1.8 sec for the latter 10 when the 
guidance force was enabled, while it took 3.2 sec and 2.4 sec 
respectively when the force was off. We can see that the 
average time with the force guidance is 29% less in total, 
and 33% less in the first 10 trials. The common decrease 
in the latter 10 trials in both cases shows that the human 
operator "learned" to operate better, and we can also see that 
the average time in the latter part with force guidance was 
still 25% faster than that without force guidance (see Table 
I). 

Table I. Average time for object centering 

                          First 10    Last 10    Overall 
With force guidance       2.1 sec     1.8 sec    1956 ms 
Without force guidance    3.2 sec     2383 ms    2774 ms 
Improvement               33%         25%        29% Faster 
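
The reported improvements can be checked with a short calculation. The sketch assumes that 1956 ms and 2774 ms are the overall averages with and without force guidance, and it uses the rounded per-block times quoted in the text; the function name is illustrative.

```python
def percent_faster(with_guidance, without_guidance):
    """Relative reduction in completion time, in percent."""
    return 100.0 * (1.0 - with_guidance / without_guidance)


overall = percent_faster(1956, 2774)  # overall averages in ms (assumed mapping)
first10 = percent_faster(2.1, 3.2)    # rounded seconds quoted in the text
last10 = percent_faster(1.8, 2.4)     # rounded seconds quoted in the text
```

The overall value is about 29.5% and the last-10 value is 25%, in line with the reported figures. From the rounded seconds the first-10 value comes out near 34% rather than the reported 33%, presumably because the reported figure was derived from unrounded times.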



Although the task we present for showing the feasibility of 
the approach is focused on object centering, the 
methodology proposed is designed to allow learning of 
manipulation tasks in which both the position and 
orientation of the end-effector are significant (such as pouring 
a cup of coffee or grooming with a hair brush). Future work 
will involve expanding the set of new manipulation tasks 
and fully assessing the capability of the system to 
extrapolate its knowledge to execute tasks w 
characteristics. 



We then tested our learning approach and show the results 
derived from applying our learning methodology to one of 
the object-centering task sequences. Figure 9 depicts the 
end-effector locations during the teleoperated task sequence. 
Figure 10 depicts the resulting task sequence learned by the 
manipulator (i.e. extracted from the trained neural network). 
By defining the error based on the difference between the 
end-effector position at the conclusion of the final step in the 
teleoperated versus learned sequence, the error is calculated 
at 4.4% for this case of the object-centering task. 
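
An error measure of this kind can be sketched as below. The paper does not state how the positional difference is normalized into a percentage, so the sketch assumes normalization by the straight-line distance between the first and last teleoperated end-effector positions; that choice and the function name are hypothetical.

```python
import math


def final_position_error(teleop_path, learned_path):
    """Percent error between the final end-effector positions of two paths.

    Assumption (not from the paper): the gap between the two final
    positions is divided by the net displacement of the teleoperated path.
    Each path is a sequence of (x, y, z) end-effector positions.
    """
    gap = math.dist(teleop_path[-1], learned_path[-1])
    travelled = math.dist(teleop_path[0], teleop_path[-1])
    if travelled == 0.0:
        raise ValueError("teleoperated path has zero net displacement")
    return 100.0 * gap / travelled
```

For example, a learned sequence that ends 0.5 units away from the teleoperated endpoint after a 10-unit displacement gives an error of 5% under this definition.
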
V. CONCLUSIONS 

In this paper, we present a methodology that enables 
learning of new manipulation tasks using teleoperation 
guided by haptic feedback forces. The approach uses vision 
as a means of providing feedback data to the user during task 
execution. The corresponding temporal data sequence of 
joint angles is then utilized by the robotic system for 
learning using a hierarchical neural network structure. 



